Automatic perceptual categorization of disordered connected speech
Abstract
The objective of the presentation is to report experiments on the automatic classification of disordered connected speech into binary (normal, pathological) or multiple (modal, moderately hoarse, severely hoarse) categories. Multi-category classification according to the perceived degree of hoarseness is considered clinically meaningful and desirable, given that reliable perceptual classification of disordered voice stimuli by human listeners is known to be difficult and time-consuming. The acoustic cues are temporal signal-to-dysperiodicity ratios as well as mel-frequency cepstral coefficients. The classifiers are support vector machines that have been trained and tested on two connected speech corpora. The binary classification accuracy was high (98%) for both sets of acoustic cues. The multi-category classification accuracy was 70% when based on signal-to-dysperiodicity ratios and 59% when based on mel-frequency cepstral coefficients.
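As a rough illustration of what a temporal signal-to-dysperiodicity ratio measures, the sketch below estimates a dysperiodicity trace and summarizes it as a ratio in dB. It is a simplified stand-in, not the authors' method: the frame-wise lag search, the unweighted difference between a frame and its shifted copy, and the names `dysperiodicity_trace`, `sdr_db`, `min_lag`, `max_lag` and `frame_len` are all assumptions made for this example. In a full pipeline, ratios of this kind would be passed as features to a classifier such as a support vector machine, which is not shown here.

```python
import math


def dysperiodicity_trace(signal, min_lag, max_lag, frame_len):
    """Return a residual trace for consecutive frames (assumed, simplified method).

    For each frame, try every lag in [min_lag, max_lag] and keep the residual
    (frame minus its lag-shifted copy) that has the smallest energy.
    """
    trace = []
    last_start = len(signal) - max_lag - frame_len
    for start in range(0, last_start + 1, frame_len):
        frame = signal[start:start + frame_len]
        best_energy, best_residual = None, None
        for lag in range(min_lag, max_lag + 1):
            shifted = signal[start + lag:start + lag + frame_len]
            residual = [a - b for a, b in zip(frame, shifted)]
            energy = sum(r * r for r in residual)
            if best_energy is None or energy < best_energy:
                best_energy, best_residual = energy, residual
        trace.extend(best_residual)
    return trace


def sdr_db(signal, trace):
    """Signal-to-dysperiodicity ratio in dB over the analyzed samples."""
    analyzed = signal[:len(trace)]
    signal_energy = sum(x * x for x in analyzed)
    trace_energy = sum(e * e for e in trace)
    if trace_energy == 0.0:
        return math.inf  # perfectly periodic over the analyzed span
    return 10.0 * math.log10(signal_energy / trace_energy)
```

A strictly periodic signal yields a very large ratio, while added noise or cycle-to-cycle irregularity lowers it, which is why a ratio of this kind can serve as a cue for the perceived degree of hoarseness.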
Similar resources
Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speech
The objective of the presentation is to report experiments involving the automatic classification of disordered connected speech into multiple (modal, moderately hoarse, severely hoarse) categories. Support vector machines, used for the classification, have been fed with temporal signal-to-dysperiodicity ratios, the first rahmonic amplitude as well as mel-frequency cepstral coefficients. The si...
Multi-band and multi-cue analyses of disordered connected speech
The objective is to analyze vocal dysperiodicities in connected speech produced by dysphonic speakers. The analysis involves a speech variogram-based method that enables tracking instantaneous vocal dysperiodicities. The dysperiodicity trace is summarized by means of the signal-to-dysperiodicity ratio, which has been shown to correlate strongly with the perceived degree of hoarseness of the spea...
Turning wine into water: Can ordinary speech be artificially nasalized?
Synthetic speech has recently been used to study resonance and voice disorders. The advantage of synthetic speech is that it becomes possible to control and artificially manipulate acoustic variables related to a specific perceptual feature without confounding effects from other co-occurring problems often found in disordered speech. To date, synthesis has been restricted to vowels in isolation...
Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned to the design and production of speech recognition systems. Among these, lip-reading techniques have encountered many challenges for speech recognition, one of the challenges b...
Automatic Prediction of Speech Evaluation Metrics for Dysarthric Speech
Over the last decades, automatic speech processing systems have made important progress and achieved remarkable reliability. As a result, such technologies have been exploited in new areas and applications, including medical practice. In the context of disordered speech evaluation, perceptual evaluation is still the most common method used in clinical practice for the diagnosing and the following ...